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Abstract 

We  consider  inverse  or  parameter  estimation  problems  for  general  nonlinear  nonau- 
tonomous  dynamical  systems  with  delays.  The  parameters  may  be  from  a  Euclidean 
set  as  usual,  may  be  time  dependent  coefficients  or  may  be  probability  distributions 
across  a  population  as  arise  in  aggregate  data  problems.  Theoretical  convergence  re¬ 
sults  for  finite  dimensional  approximations  to  the  systems  are  given.  Several  examples 
are  used  to  illustrate  the  ideas  and  computational  results  that  demonstrate  efficacy  of 
the  approximations  are  presented. 
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1  Introduction 


Delay  differential  equations  have  been  a  topic  of  much  interest  in  the  mathematical  research 
literature  for  more  than  50  years.  Contributions  range  from  classical  applications  and  theo¬ 
retical  and  computational  methodologies  [Ba79,  Ba82,  BBul,  BBu2,  BKap,  BellmanCooke, 
Cushing,  Diekmann,  Driver,  Gorecki,  JKH1,  JKH2,  JKH3,  Kap82,  KapSal87,  KapSal89, 
KapSch,  Kuang,  Minorsky,  Webb,  Wright]  to  modern  applications  in  biology  [BBJ,  BBH, 
MSNP,  NMiP,  NMuP,  NP],  In  this  paper  we  return  to  a  topic  that  has  become  increasingly 
relevant  in  current  research:  a  theoretical  and  computational  approach  for  inverse  problems 
involving  nonlinear  delay  systems.  One  approach  that  is  by  now  classical  dates  back  to 
the  1970’s  [Ba79,  BBul,  BBu2,  BKap].  In  this  approach  one  approximates  solutions  to  the 
infinite  dimensional  state  systems  such  as  (1)  below  by  first  converting  them  to  an  abstract 
evolution  equation  in  a  functional  analytic  state  space  setting.  One  can  approximate  solu¬ 
tions  in  finite  dimensional  subspaces  spanned  by  pre-chosen  basis  elements  (e.g.,  piece-wise 
linear  or  cubic  splines)  in  a  Galerkin  approach  which  is  equivalent  to  a  finite  element  ap¬ 
proximation  framework  (as  is  classically  used  for  partial  differential  equations).  One  is  then 
able  to  numerically  calculate  the  generalized  Fourier  coefficients  of  approximate  solutions 
relative  to  the  splines,  and  with  these  coefficients,  recover  an  approximation  to  the  solutions 
of  delay  systems  (1). 

Here  we  turn  to  the  mathematical  aspects  of  these  nonlinear  FDE  systems  and  present 
an  outline  of  the  necessary  mathematical  and  numerical  analysis  foundations.  Thus  we 
provide  an  extension  (to  treat  time  dependent  coefficients  and  general  parameters  including 
probability  measures)  of  arguments  for  approximation  and  convergence  in  inverse  problems 
found  in  [Ba82], 

For  nonlinear  delay  systems  such  as  those  discussed  here,  approximation  in  the  context 
of  a  linear  semigroup  framework  as  presented  [BBul,  BBu2,  BKap]  is  not  direct.  However 
one  can  use  the  ideas  of  that  theory  as  a  basis  for  a  wide  class  of  nonlinear  delay  system 
approximations.  Details  in  this  direction  can  be  found  in  the  early  work  [Ba79,  Kap82] 
which  is  a  direct  extension  of  the  results  of  [BBul,  BBu2,  BKap]  to  nonlinear  delay  sys¬ 
tems.  The  new  theoretical  results  presented  here  are  extensions  of  these  earlier  ideas  to 
general  nonlinear,  nonautonomous  delay  systems;  specifically  we  extend  the  ideas  of  [Ba82] 
to  treat  nonlinear  systems  with  time  dependent  coefficients  and/or  parameters  that  may  be 
probabilistic  in  nature  (i.e.,  probability  distributions  as  treated  in  [BBoIP,  BBPP]).  Several 
current  application  areas  are  used  to  illustrate  the  theory  with  examples. 

We  consider  general  systems  of  the  form 

z(t)  =  f(t,z(t),zt,z(t-Ti),...,z(t-Tm),q)  +  /2(i),  0  <  t  <  T,  z0  =  4>  (!) 

where/  =  f(t,ri,i/>,yi,...,ym,q)  :  [0,T]xIxlnmxQ  — x  Mn.  Here  X  =  MnxL2(-r,  0;  Mn), 
0  <  Ti  <  . . .  <  Tm  =  r,  zt  denotes  the  usual  function  zt{9)  =  z(t  +  9),  —  r  <  8  <  0,  and 
<f>  G  Hl{— r,  0).  Here  the  admissible  parameter  set  Q  is  a  subset  of  a  metric  space  (possibly 
infinite  dimensional  -  i.e.,  some  set  of  functions). 

Associated  with  this  system  is  an  ordinary  least  squares  [BDSS,  BT]  cost  functional  to 
be  minimized.  That  is,  we  consider  the  problem  of  minimizing  over  q  G  Q  the  ordinary 
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least  squares  output  functional 


J(q,d)  =  ^2\Cz(U’,q)  ~di\2,  (2) 

i=  1 

where  C  is  an  observation  operator  and  {d;}  is  a  given  data  set. 

As  we  shall  see  below,  one  can  rewrite  (1)  as 

x(t)  =  A(t,q)x(t)  +  f{t)  .  . 

x(0)  =  so,  1  j 

for  states  x(t)  =  (z(t),zt)  in  an  abstract  space  X.  One  can  then  develop  theoretical  and 
computational  methodologies  to  treat  finite  dimensional  approximations  in  spaces  XN  and 
QM .  These  ideas  are  the  focus  of  our  presentation  below. 

2  Inverse  or  Parameter  Estimation  Problems 

2.1  Approximation  and  Convergence 

For  more  details  on  general  inverse  problem  methodology  in  the  context  of  abstract  dis¬ 
tributed  systems,  the  reader  may  consult  [BK,  BSW].  The  book  [BK]  contains  a  general 
treatment  of  inverse  problems  for  partial  differential  equations  in  a  functional  analytic 
setting.  Here  we  treat  nonlinear  delay  systems  with  a  general  family  of  probabilistic  pa¬ 
rameters. 

The  minimization  in  general  abstract  parameter  estimation  problems  for  (3)  involves 
an  infinite  dimensional  state  space  X  and  an  infinite  dimensional  admissible  parameter  set 
Q  (generally  of  functions  or  even  probability  distributions).  To  obtain  computationally 
tractable  methods,  we  thus  consider  Galerkin  type  approximations.  Let  XN  be  a  sequence 
of  finite  dimensional  subspaces  of  X ,  and  QM  be  a  sequence  of  finite  dimensional  sets 
approximating  the  parameter  set  Q.  We  denote  by  PN  the  orthogonal  projections  of  X 
onto  XN .  Then  a  family  of  approximating  estimation  problems  with  finite  dimensional 
state  spaces  and  parameter  sets  can  be  formulated  by  seeking  q  £  QM  which  minimizes 

K 

JN(q,d)  =  ^2\CxN(ti;q)  -  di\2,  (4) 


where  xN(t;  q )  €  XN  is  the  solution  to  a  finite  dimensional  approximation  of  (3))  given  by 

xN  (t)  =  AN(t,q)xN(t)  +  PNf(t) 
xN  (0)  =  PNx o- 

For  the  parameter  sets  Q  and  Q M ,  and  state  spaces  XN ,  we  make  the  following  hypotheses. 

(AIM)  The  sets  Q  and  QM  lie  in  a  metric  space  Q  with  metric  d.  It  is  assumed  that  Q 
and  Qm  are  compact  in  this  metric  and  there  is  a  mapping  iM  :  Q  — >•  Qhl  so  that 
qAI  _  Furthermore,  for  each  q  €  Q,  iM(q)  — >•  q  in  Q  with  the  convergence 

uniform  in  q  €  Q. 
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(A2N)  The  finite  dimensional  subspaces  XN  satisfy  the  approximation:  For  each  x  €  X,  \x  — 
PNx\x  — >  0  as  N  — »•  oo. 

Solving  the  approximate  estimation  problems  involving  (4), (5),  we  obtain  a  sequence 
of  parameter  estimates  {qN,M}.  It  is  of  paramount  importance  to  establish  conditions  un¬ 
der  which  {qN,M}  (or  some  subsequence)  converges  to  a  solution  for  the  original  infinite 
dimensional  estimation  problem  involving  (2), (3).  Toward  this  goal  we  have  the  following 
results. 

Theorem  1.  To  obtain  convergence  of  at  least  a  subsequence  of  {qN,M}  to  a  solution  q 
of  minimizing  (4)  subject  to  (5),  it  suffices,  under  assumption  (AIM),  to  argue  that  for 
arbitrary  sequences  {qN,M}  in  QAI  with  qN’M  — >•  q  in  Q,  we  have 

xN(t,qN’M)^x(t-,q).  (6) 

Proof:  Under  the  assumptions  (AIM),  let  {qN,M}  be  solutions  minimizing  (4)  subject 
to  the  finite  dimensional  system  (5)  and  let  qN,M  €  Q  be  such  that  iM(qN,M)  =  qN)M .  From 
the  compactness  of  Q,  we  may  select  subsequences,  again  denoted  by  {qN,M}  and  {qN'M}, 
so  that  qN’M  — )•  q  e  Q  and  qN’M  — y  q  (the  latter  follows  from  the  last  statement  of  (AIM)). 
The  optimality  of  {qN,M}  guarantees  that  for  every  q  €  Q 

JN(qN’M,d)  <  JN(iM(q),d).  (7) 

Using  (6),  the  last  statement  of  (AIM)  and  taking  the  limit  as  N,M  — >•  oo  in  the  inequality 
(7),  we  obtain  J(q,d)  <  J{q,d)  for  every  q  €  Q,  or  that  q  is  a  solution  of  the  problem  for 
(2), (3).  We  observe  that  under  uniqueness  assumptions  on  the  problems  (a  situation  that 
we  hasten  to  add  is  not  often  realized  in  practice),  one  can  actually  guarantee  convergence 
of  the  entire  sequence  {qN,M}  in  place  of  subsequential  convergence  to  solutions. 

We  note  that  the  essential  aspects  in  the  arguments  given  above  involve  compactness 
assumptions  on  the  sets  QM  and  Q.  Such  compactness  ideas  play  a  fundamental  role  in  other 
theoretical  and  computational  aspects  of  these  problems.  For  example,  one  can  formulate 
distinct  concepts  of  problem  stability  and  method  stability  as  in  [BK]  involving  some  type 
of  continuous  dependence  of  solutions  on  the  observations  z,  and  use  conditions  similar 
to  those  of  (6)  and  (AIM),  with  compactness  again  playing  a  critical  role,  to  guarantee 
stability.  We  illustrate  with  a  simple  form  of  method  stability  (other  stronger  forms  are  also 
amenable  to  this  approach-see  [BK]). 

We  might  say  that  an  approximation  method,  such  as  that  formulated  above  involving 
Qm,Xn  and  (4)-(5),  is  stable  if 

dist(qN’M(dk),q(d*))  ^  0 

as  N,M,k  — >•  oo  for  any  dk  — >•  d*  (in  this  case  in  the  appropriate  Euclidean  space),  where 
q(z)  denotes  the  set  of  all  solutions  of  the  problem  for  (2)-(3)  and  qN’M(d)  denotes  the 
set  of  all  solutions  of  the  problem  for  (4)-(5).  Here  “dist”  represents  the  usual  distance  set 
function.  Under  (6)  and  (AIM),  one  can  use  arguments  very  similar  to  those  sketched  above 
to  establish  that  one  has  this  method  stability.  If  the  sets  QM  are  not  defined  through  a 
mapping  iM  as  supposed  above,  one  can  still  obtain  this  method  stability  if  one  replaces 
the  last  statement  of  (AIM)  by  the  assumptions: 
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(i)  If  {qM}  is  any  sequence  with  qM  £  QM ,  then  there  exist  q*  in  Q  and  subsequence 
{ qMk  }  with  qMk  — >  q*  in  the  Q  topology. 

(ii)  For  any  q  £  Q,  there  exists  a  sequence  {qM}  with  qM  £  QM  such  that  qM  — >•  q  in  Q. 

Similar  ideas  may  be  employed  to  discuss  the  question  of  problem  stability  for  the  prob¬ 
lem  of  minimizing  (2)  over  Q  (i.e.,  the  original  problem)  and  again  compactness  of  the 
admissible  parameter  set  plays  a  critical  role. 

Compactness  of  parameter  sets  also  plays  an  important  role  in  computational  considera¬ 
tions.  In  certain  problems,  the  formulation  outlined  above  (involving  QA1  =  iM  (Q))  results 
in  a  computational  framework  wherein  the  QM  and  Q  all  he  in  some  uniform  set  pos¬ 
sessing  compactness  properties.  The  compactness  criteria  can  then  be  reduced  to  uniform 
constraints  on  the  derivatives  of  the  admissible  parameter  functions.  There  are  numerical 
examples  (for  example,  see  [BI86] )  which  demonstrate  that  imposition  of  these  constraints 
is  necessary  (and  sufficient)  for  convergence  of  the  resulting  algorithms.  (This  offers  a  pos¬ 
sible  explanation  for  some  of  the  numerical  failures  [YY]  of  such  methods  reported  in  the 
engineering  literature.) 

The  sets  (spaces)  Q  and  QM  in  the  inverse  problem  framework  above  are  an  important 
component  in  any  problem  formulation  and  may  involve  constant  vector  parameters,  time 
or  spatially  dependent  functions  or  even  probability  measures.  In  many  widely  encountered 
problems  the  set  of  admissible  parameters  Q  may  consist  of  simply  some  compact  subset  of 
finite  dimensional  Euclidean  space.  In  this  case  one  does  not  need  the  additional  family  of 
sets  Qm  in  the  above  theory  (i.e.,  the  above  formulation  and  theory  holds  with  Q =  Q 
for  all  M).  However  in  an  increasing  number  of  applications  (for  example  in  the  three 
examples  outlined  below)  the  parameters  sought  are  functions  of  time  or  space.  Then  one 
often  uses  approximation  families  to  construct  the  family  QM.  For  example,  in  Example  1 
below,  some  of  the  parameters  to  be  estimated  are  time  dependent  coefficients  in  ordinary 
differential  equation  dynamical  systems.  In  this  case  one  might  choose  some  set  of  functions 
Q  on  a  time  interval  [0,  T]  and  then  choose  piecewise  linear  splines  for  the  approximating 
families  QM  (see  [BBJ]  for  details)  and  use  spline  approximation  properties  (e.g.,  see  [BK]) 
to  argue  that  the  conditions  of  (AIM)  hold. 

Problems  with  uncertainty  in  parameters  (or  parameters  representing  some  distribu¬ 
tion  across  a  population  in  the  case  of  aggregate  data  [BBi,  BBH,  BBPP,  BD,  BDTR, 
BDEHADB,  BDEHAD,  BFPZ,  BG1,  BG2,  BPi,  BPo])  pose  even  more  interesting  and 
challenging  possibilities.  Several  choices  may  arise  for  an  underlying  finite  dimensional  Eu¬ 
clidean  set  Q:  (i)  Q  is  a  compact  subset  of  Mp;  (ii)  Q  =  [— r,  0]  is  a  set  of  possible  delay 
times  t  in  some  dynamical  process.  In  these  cases  a  frequent  choice  is 

Q  =  V(Q)  =  {P  :  Q  — »•  M1  :  P  is  a  probability  distribution  on  Q}, 

i.e.,  Q  is  the  set  of  all  probability  distributions  on  Q.  To  investigate  theoretical,  com¬ 
putational  and  approximation  issues  for  these  problems,  it  is  necessary  to  put  a  topology 
on  the  space  of  probability  measures:  a  natural  choice  for  V(Q )  is  the  Prohorov  met¬ 
ric  p  topology  (see  [Bi,  Hu,  P]).  Convergence  in  this  metric  p(Pk,P)  —>  0  is  equivalent 
to  fQgdPk(q)  fQgdP(q)  for  all  bounded,  continuous  g  :  Q  — >•  R1.  Thus  if  we  view 
V(Q)  as  a  subset  of  the  topological  dual  of  the  bounded  continuous  functions  on  Q,  i.e., 
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V{Q)  C  Cb(Q)*,  convergence  in  the  Prohorov  metric  is  equivalent  to  weak*  convergence 
and  the  weak*  topology  is  nretrizable  with  the  Prohorov  metric.  Then  one  must  construct 
families  QM  =  VM(Q)  to  approximate  the  distributions  Q  =  V{Q)  in  the  Prohorov  metric. 

To  pursue  this,  it  is  useful  to  formulate  methods  to  yield  finite  dimensional  sets  VAI  (Q) 
over  which  to  minimize  J(_P).  Of  course,  we  wish  to  choose  these  methods  so  that  UVM  (Q)  — » 
V(Q)n  in  some  sense  so  that  the  conditions  (AIM)  can  be  satisfied.  This  can  be  done  in  the 
context  of  a  framework  one  based  on  the  Prohorov  metric  [BBPP,  BBi]  of  weak*  convergence 
of  measures. 

A  general  theoretical  framework  is  given  in  [BBPP]  with  specific  results  on  the  approx¬ 
imations  we  use  here  given  in  [BBi,  BPi].  Briefly,  ideas  for  the  underlying  theory  are  as 
follows: 

1.  One  argues  continuity  of  P  — >•  J(P)  on  Q  =  V(Q)  with  the  Prohorov  metric  p\ 

2.  If  Q  is  compact  then  Q  =  V(Q)  is  a  complete  metric  space,  indeed  compact,  when 
taken  with  Prohorov  metric; 

3.  Approximation  families  QAI  =  VM (Q)  are  chosen  so  that  elements  PM  €  VM (Q)  can 
be  found  to  approximate  elements  P  €  V(Q)  in  Prohorov  metric; 

4.  Well-posedness  (existence,  continuous  dependence  of  estimates  on  data,  etc.)  is  ob¬ 
tained  along  with  feasible  computational  methods. 

The  desired  results  can  be  developed  using  several  approximation  theories  that  have 
been  recently  developed  and  used  in  the  context  of  problems  other  than  those  with  delay 
systems.  The  first,  developed  in  [BBi]  and  based  on  Dirac  delta  measures,  is  summarized 
in  the  following  theorem. 

Theorem  2.  Let  Q  be  a  complete,  separable  metric  space,  B  the  class  of  all  Borel  subsets  of 
Q  and  V(Q)  the  space  of  probability  measures  on  ( Q,B ).  Let  Q o  =  be  a  countable, 

dense  subset  of  Q.  Then  the  set  of  P  G  V(Q)  such  that  P  has  finite  support  in  Qo  and 
rational  masses  is  dense  inV(Q)  in  the  p  metric.  That  is, 

k  k 

Vo (Q)  =  {P  £  V(Q)  :  P  =  Y,PjAqj,k  €  A f+ ,  qj  €  Qo,  Pj  rational,  ^ pj  =  1} 

3= i  3= i 

is  dense  in  V(Q)  taken  with  the  p  metric,  where  Aqj  is  the  Dirac  measure  with  atom  at  qj. 

It  is  rather  easy  to  use  the  ideas  and  results  associated  with  this  theorem  to  develop  com¬ 
putationally  efficient  schemes.  Given  Q ^  =  Um=i  Qm  with  Qm  =  {qf}f=\  (a  “partition” 
of  Q)  chosen  so  that  Qd  is  dense  in  Q,  define 

M  M 

qM  =  pm(Q)  =  {P  <e  V(Q)  :  P  =  ^2pjAqM,qf  €  QM,  Pj  rational,  ^Pj  =  1}. 

j= 1  i=1 

Then  we  find 

(i)  QhI  =  Vm(Q)  is  a  compact  subset  of  Q  =  V(Q)  in  the  p  metric, 


6 


(ii)  Vm(Q)  C  Vm+1(Q)  whenever  Qm+ i  is  a  refinement  of  Qm, 

(iii)  “VM  (Q)  — >•  V{Q)”  in  the  p  topology;  that  is,  for  M  sufficiently  large,  elements  in  V(Q) 
can  be  approximated  in  the  p  metric  by  elements  of  VM . 

A  second  class  of  approximations  was  developed  and  used  in  [BPi]  for  problems  where 
one  assumes  that  the  probability  distributions  to  be  approximated  possess  densities  in  L2. 
These  involves  approximation  with  piecewise  linear  splines  at  the  level  of  the  densities. 

Theorem  3.  Let  F  be  a  weakly  compact  subset  of  L2(Q),  Q  compact  and  let  Vf{Q)  = 
{P  €  V{Q)  :  P'  =  p,  p  G  F}.  Then  Q  =  Vj^(Q)  is  compact  in  Q  =  V(Q)  in  the  p  metric. 
Moreover,  if  we  define  {ij4}  to  be  the  linear  splines  on  Q  corresponding  to  the  partition 
Qm,  where  (J mQm  is  dense  in  Q,  define 

VM  =  {pM  :pM  =  J2  bflf,  bf  rational  } 
j 

and  if 

QM  =  Vtm  =  {PM  €  V(Q)  :  PM  =  j pM ,  pM  €  VM }, 
we  have  |JM  Pj-m  is  dense  in  Q  =  Vp(Q)  taken  with  the  p  metric. 

A  study  comparing  the  relative  strengths  and  weaknesses  of  these  two  classes  of  approx¬ 
imation  schemes  in  the  context  of  inverse  problems  is  given  in  [BD]. 

Thus  we  have  that  compactness  of  admissible  parameter  sets  play  a  fundamental  role  in 
a  number  of  aspects,  both  theoretical  and  computational,  in  parameter  estimation  problems. 
This  compactness  may  be  assumed  (and  imposed)  explicitly  as  we  have  outlined  here,  or  it 
may  be  included  implicitly  in  the  problem  formulation  through  Tikhonov  regularization  as 
discussed  for  example  by  Kravaris  and  Seinfeld  [KS],  Vogel  [Vog]  and  widely  by  many  others. 
In  the  regularization  approach  one  restricts  consideration  to  a  subset  Qi  of  parameters 
which  has  compact  embedding  in  Q  and  modifies  the  least-squares  criterion  to  include  a 
term  which  insures  that  minimizing  sequences  will  be  Q\  bounded  and  hence  compact  in 
the  original  parameter  set  Q. 

After  this  short  digression  on  general  inverse  problem  concepts,  we  return  to  the  con¬ 
vergence  (6). 

2.2  State  Approximation  and  Convergence  for  Nonlinear  Systems 

We  consider  the  general  system 

z{t)  =  f(t,z(t),zt,z(t  -  Ti),. . .  ,z(t  -  Tm),q)  +  f2(t),  0<t<T,  zq  =  <j)  (8) 

where/  =  f{t,7j,ip,y  1,...,ym,q)  :  [0,T]xIxlnmxQ  -»•  MV  Here  A  =  MnxL2(-r,  0;  Mn), 
0  <  T\  <  . . .  <  Tm  =  r,  zt  denotes  the  function  zt(0)  =  z(t  +  9),  —  r  <  0  <  0,  and 
<f>  €  Hl{— r,  0).  We  shall  make  use  of  the  following  hypotheses  throughout  our  presentation. 
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(HI)  The  function  /  satisfies  a  global  Lipschitz  condition: 


-  fpt,£,ip,wi,...,wm,q) \< 

K  (\y  -  £|  +  \ip  -  i>\  +  YT=  i  I  Vi  -  wi\) 

for  some  fixed  constant  K  and  all  pq,  ip,yi, . . . ,  ym),  (£,ip,wi, . . , ,  wm )  in  X  x  Mnm 
uniformly  in  t  and  in  q  €  Q. 

(H2)  The  function  /(•,  --,g)  :  [0,T]  x  X  x  Mnm  -»  M"  is  differentiable  for  each  q. 

(H3)  The  function  q  — >•  /(•,  g)  is  continuous  on  Q. 

Remark  1.  If  we  define  the  function  F  :  [0,  T]  x  Mn  x  C(— r,  0;  Mn)  xQc  [0,T]xXxQ^  Mn 
given  by 

F(t,  x,  q)  =  F(t ,  r),  ip,  q)  =  f(t,  rj,  ip,  ip(~n), ...,  ip{-Tm),q)  (9) 

we  observe  that  even  though  /  satisfies  (HI),  F  will  not  satisfy  a  continuity  hypothesis  on 
its  domain  in  the  X  norm. 

We  define  the  nonlinear  operator  Apt]  q)  :  T>{A)  C  X  — >•  X  by 

V(A)  =  {(iP(0),iP)\iPeH\-r,0)} 

Apt]  q)(ip(0),ip)  =  (F(t,ip(0),ip,q),Dip) 

where  here  Dip  =  ip1 .  Note  that  T>(A)  is  independent  of  t  and  q.  We  then  may  write  the 
system  (8)  in  abstract  form 


x(t)  =  Apt]q)xpt)  +  (f2pt),0)  nm 

x  (o)  =  c  =  (0(O),<^) ,  liUj 

for  states  xpt)  =  (zpt),zt)  in  the  abstract  space  X. 

Theorem  4.  Assume  that  (HI)  holds  and  let  xpt](p,f2)  =  (zpt]  (p,  f2),  zt(<p,  /2)),  where  z  is 
the  solution  of  (8)  corresponding  to  cp  €  Hl ,  f2  €  L2.  Then  for  (  =  ((p(0),cp),  x(t]<p,f2)  is 
the  unique  solution  on  [0,  T]  of 

x(t]q)  =  C+  /  [A(cr,q)x((T]q)  +  (f2(a),0)]da.  (11) 

Jo 

Furthermore,  f2  xpt]  (p,  f2)  is  weakly  sequentially  continuous  from  L2  (weak)  to  X  (strong). 

These  results  can  be  established  (we  do  not  do  so  here)  in  one  of  several  routine  ways: 
fixed  point  theorem  arguments  [JKH4]  or  Picard  iteration  arguments.  Either  of  these  ap¬ 
proaches  can  be  used  to  establish  existence,  uniqueness  and  continuous  dependence  of  so¬ 
lution  of  (11).  For  existence,  uniqueness  and  continuous  dependence  of  solution  of  (8), 
we  note  that  our  condition  (HI)  is  a  global  version  of  the  local  hypothesis  of  Kappel  and 
Schappacher  in  [KapSch],  so  that  their  results  also  yield  immediately  the  desired  result  for 
(8)  in  the  autonomous  case. 


The  uniqueness  of  solutions  to  (11)  follows  in  the  usual  manner  once  we  establish  that 
A  satisfies  a  dissipative  inequality.  We  do  this  in  a  space  Xg  that  is  topologically  equivalent 
to  X.  Renorm  X  by  the  weighting  function  g  defined  on  [— r,  0),  where  g(£)  =  j  for 
£  €  [—Tm-j- |_i,  — Tm—j ) ,  j  =  1,2 ,m  (we  define  tq  =  0).  Define  the  Hilbert  space  Xg  = 
Rn  x  L2{— r,  0;  g;  Rn)  to  be  the  elements  of  X  with  this  new  inner  product 

<  (C,0)  >x9=<  v,C  >R"  +  /  (12) 

J  —r 

This  gives  rise  to  an  equivalent  topology  to  that  of  X  as  long  as  <?(£)  >  0  for  all  £  €  [— r,  0) 
(see  [BBu2,  p.  186],  [BKap,  Webb]  for  more  details).  Then  the  nonlinear  operator  A(P,q) 
is  dissipative  if  for  some  k  >  0  we  have 

(A(P,  q)x  -  A(t ;  q)w,  x  -  w)Xg  <  n{x  -  w,x  -  w)Xg  (13) 

for  all  x ,  w  €  T>(A)  and  all  t  and  q  E  Q.  This  can  be  used  to  immediately  argue  uniqueness 
of  solutions.  We  outline  the  arguments  to  establish  the  fundamental  inequality  (13).  We 
have  for  x  =  (0(0),  0),  w  =  (0(0),  0) 

<  A(t;  q)x  -  A(t;  q)w ,  x-w  >Xg  =  <  F(t ,  0(0),  0)  -  F(t,  0(0),  0),  0(0)  -  0(0)  >R» 

+  <  T»0  -  T»0,  0-0  >L2(— r,0;g;R") 

=  <  -^(0  0(0),  0)  -  T(t,  0(0),  0),  0(0)  -  0(0)  >Rn 

+  [  D(0(£)  -  0(£))(0(£)  -  0(£)M£R 

J  — r 

=  <  F(t,  0(0),  0)  -  F(t,  0(0),  0),  0(0)  -  0(0)  >R» 

m  „_T 

_  /  '  m—j 

+ y  /  ^(0(0  -  ^(0x0(0  - 

7  =  1  J—Tm-j  + 1 

(14) 

Consider  the  last  term  and  denote  A m_j  =  (0  —  0)(rTO_j)  =  0(— Tm~j)  —  for 

j  =  0, 1, . . . ,  m.  Then  for  A(£)  =  0(£)  —  0(£)  we  have 
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m 


E 


'm—j 


m 


1=1 

m  m— 1 

j=l  /c=0 

^  m—  1 

^m|A0|2  +  -  j|Am_j 

l=i 

m— 1 

E 

J  =  1 
m—  1 


£>(a(o)a(o^(o^ 

f'ijA(02|!=_Tm-j 

Z_-/  VSy  s=  ^Vn— j+l 
3  = 1 
m  /  . 

Z  (  ^lAm-lP  -  2l^m_J+1l" 

1  m—1 

^  + 1)lAm-fcl2  (forfc  =  J-1) 

1  m_1  1 
-J>  +  l)|Am_fc|2--|A, 

z  fc=l 

±m|A„|2  -  )  |Am_/ -  l|Am|2 

1  m 

|A0|2--]T|A, 


=  |™|A„|2-i^|A 


2' 
m  +  1 . 


m-j  | 


xm—j  | 


l=o 


1=0 


Returning  to  (14),  we  have 
<  A(t;q)x  —  A(t;q)w,x  —  w  >xg  < 


< 


< 


< 

< 

< 

< 


<  -F(i,  0(0),  0)  -  -F(i,  0(0),  0),  0(0)  -  0(0)  >i« 


+- 


m  +  1 


1  m 

|Ao|2-?ElA 


m—j  | 


1=0 


jqiAol  +  IAI  +  jriAil  |A0| 

i=l  / 

|A0|2~£> 


+ 


V 

m  +  1 . 


9  i  -*m-j  I 
1=0 


A-|Ao|2  +  0|A|2  +  !|A„|2  +  i£>,|2  + 


m,K2 


i= 1 


+  - 


m  +  1 


1  m 

|a»I2-?E|a 


m-j  | 


1=0 


R  + 


mK 2  m  +  1  \ 


4 - 9 —  )  |Ao|2  +  — |A| |2 


*  +  =£  +  =±1')|a& 


K  X  —  W 


X 


K\X~W\Xg, 


|A0|2 
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using  the  definition  of  the  inner  product.  Therefore  choosing 


k  =  K  T 


mK 2 
2 


+ 


m  +  1 
2 


we  have  that  *4(t;  q)  is  dissipative  in  Xg.  uniformly  in  t  and  q  . 

Turning  next  to  the  approximation  of  (8)  through  approximation  of  (11),  we  let  XN  be 
the  spline  subspaces  of  X  discussed  in  detail  in  [BKap].  We  briefly  outline  the  results  for 
the  piecewise  linear  subspaces  X ±  (see  Section  4  of  [BKap])  given  by 


X^  =  {(0(0),  4>)  |  (f>  is  a  continuous  first-order  spline  function 

with  knots  at  =  —jr/N,j  =  0, 1, . . . ,  N}. 

A  careful  study  of  the  arguments  behind  our  presentation  reveals  that  the  approximation 
results  given  here  hold  for  general  spline  approximations.  For  example,  if  one  were  to  treat 
cubic  spline  approximations  (X%  of  [BKap]),  one  would  use  the  appropriate  approximation 
analogues  of  Theorem  2.5  of  [Schultz]  and  Theorem  21  of  [SchuVarg]  (e.g.,  see  Theorem  4.5 
of  [Schultz]).  Hereafter,  when  we  write  XN ,  the  reader  should  understand  that  we  mean 
X?  of  [BKap], 

Let  PA  =  Pg  be  the  orthogonal  projection  (in  ( ,  )g  =  ( ,  )xg)  of  X  onto  XN  so  that  as 
we  have  already  discussed  it  immediately  follows  that  PN x  — >•  x  for  all  x  €  X.  Similar  to 
the  approach  in  [BKap]  as  extended  in  [Ba82],  for  arbitrary  {qN}  with  qN  — >•  q  we  define 
the  approximating  operator 

AN(t)  =  PNA(t;qN)PN 

and  consider  the  approximating  equations  in  XN  given  by 

xN(t)  =  PN(  +  f  [AN  [a)xN  (a)  +  PN(f2(a),0)]da  (15) 

Jo 

which,  because  XN  is  finite-dimensional,  are  equivalent  to 

xN(t)  =  AN(t)xN(t)  +  PNm),  o),  *"(0)  =  PN(.  (16) 

Note  that  xN (t)  =  xN(t;QN).  From  (13)  and  the  definition  of  AN  in  terms  of  the  self- 
adjoint  projections  PN ,  we  have  at  once  that  under  (HI)  the  sequence  {0^}  satisfies  on  X 
a  uniform  dissipative  inequality 


{AN (t)x  —  An (t)w,  x  —  w)g  <  k(x  —  w,  x  —  w)g. 


(17) 


Uniqueness  of  solutions  of  (15)  then  follows  immediately  from  this  inequality.  Upon  recog¬ 
nition  that  (16)  is  equivalent  to  a  nonlinear  ordinary  differential  equation  in  Euclidean  space 
with  the  right-hand  side  satisfying  a  global  Lipschitz  condition,  one  can  easily  argue  exis¬ 
tence  of  solutions  for  (16)  and  hence  for  (15)  on  any  finite  interval  [0,  T\.  Our  main  result 
to  be  discussed  here,  which  ensures  that  solutions  of  (16)  converge  to  those  of  (8),  can  now 
be  stated. 


Theorem  5.  Assume  (HI),  (H2),  (H3)  and  qN  — »  q  in  Q.  Let  (  =  (0(0),  0),  4>  €  H1  and 
/2  €  £2(0,  T)  be  given,  with  xN  and  x  the  corresponding  solutions  on  [0,T]  of  (16)  and 
(8),  respectively.  Then  xN (t)  — >•  x(t)  =  (z(t;  0,  {2),  zt((j>,  {2)),  as  IV  — >•  00,  uniformly  in  t  on 

[0,  T] . 
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Remark  2.  One  can  actually  obtain  slightly  stronger  results  than  those  given  in  Theorem  5. 
One  can  consider  solutions  of  (8)  and  (16)  corresponding  to  initial  data  (2(0),  zq)  =  (77,  <f>)  = 
£  with  7 1  €  Mn,  (j>  G  L2  (i.e.,  (  €  X)  and  argue  that  the  results  of  Theorem  5  hold  also  in 
this  case. 

To  indicate  briefly  our  arguments  for  Theorem  5,  we  consider  for  given  initial  data 
(  and  perturbation  /2  the  corresponding  solutions  x  and  xN  of  (11)  and  (15).  Define 
A N(t)  =  xN(t )  —  x(t)  and  F'At)  =  (/2(f),  0),  we  obtain  immediately  that 

AN (t)  =  (PN  -  I)(  +  [  [AN {cr)xN (a)  -  A(a)x(a)  +  (PN  -  I)F2(a)\  da.  (18) 

Jo 

We  next  use  a  rather  standard  technique  for  analysis  of  differential  equations  (see  [Barbu]), 
the  foundations  of  which  we  state  as  a  lemma  since  we  shall  refer  to  it  again. 

Lemma  3.  If  X  is  a  Hilbert  space  and  x  :  [a,  6]  — >•  X  is  given  by 

x(t)  =  x(a)  +  /  y(a)dcr, 

J  a 

then 

{x(a),y(a))da. 

This  lemma  is  essentially  a  restatement  of  the  well-known  result  [Barbu,  p.  100]  that  in 
a  Hilbert  space 

Q|*(0I2)  =  {x(t),x(t)). 

Applying  Lemma  3  to  (18),  we  obtain 
|A^(t)|2  =  \(PN-I)(  |2 

+2  Jl)(AN (a)xN (a)  -  A(cr)x(a)  +  (. PN  -  /)F2(<j),  AN(a))dcr 
=  \(PN  -  I)( |2  +  2  Jl{AN {cr)xN (a)  -  AN(a)x(a),AN(a))da 
+2  fo  ((AN (cr)  —  A(ct))x(<j)  +  (PN  -  I)F2{a),AN{a))d<j. 
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If  we  use  (17)  on  the  first  integral  term  in  this  last  expression,  we  then  have 


|AAr(t)|2  <  \(PN  —  I)(\2  +  2 /q  lo\An  (a)\2da 

+2  f*{(AN (a)  -  A(a))x(a)  +  (PN  -  I) F2(a),AN (a)) da 
<  | (PN  —  I)C\2  +  2  /q  uj\An  (a)\2da 

+2  /o  [^K-47V(cr)  ~  A°))x{cr)\2  +  ^A^o-)!2 
+\\(PN-I)F2(a)\2  +  \\A"(a)\2]da 
=  |(^A'  -  ^)C|2  +  Jo  |(^(o-)  -  -A(a))x(a)\2  da  +  JjJ  |(PiV  -  I)F2{a)\2  da 

+2(uj  +  1)  |A N(a)\2da. 

An  application  of  Gronwall’s  inequality  to  this  then  yields  the  estimate 

I  AiV(t)|2  <  [ei(M)  +  e2(N)  +  e3(lV)]  exp  (2(u  +  1  )t) ,  (19) 

where 

e1(lV)  =  |(PiV-I)C|2, 

£2 (A)  =  /  |(AjV(a)  -  A(cr))x(cj)|2(icj, 

Jo 

es(N)=  [T  \(PN  -  I)F2(a)\2  da. 

Jo 

Since  PN  — >•  I  strongly  in  X  and  the  convergence  \(PN  —  I)F2(a)\  — >•  0  in  e3  is  dominated, 
to  prove  Theorem  5  it  suffices  to  argue  that  e2 (N)  — y  0  as  N  — y  oo.  To  that  end,  we  state 
the  following  sequence  of  lemmas. 

Lemma  4.  Assume  (HI),  (H3)  and  let  X  =  {x  =  (0(0 ),<j>)  |  0  €  H2}.  Then  for  each  t, 
An (t)x  — »•  A(t)x  as  N  — »•  oo  for  each  x  €  X. 

Lemma  5.  For  fixed  q  €  Q,  let  Cq  =  {((,f2)  €  T>(A)  X  L2(0,T)  \  0  €  H2,f2  €  H1,  with 
0(0)  =  F(0,(,q)  +  f2(0)  where  (  =  (0(0),  0)}.  Assume  that  (HI),  (H2)  hold.  Then  for 
(O/2)  €  Cq  the  corresponding  solution  a  — >•  x(a)  =  (z(a),zr7)  of  (11)  (z  is  the  solution  of 
(8) )  satisfies  x(a)  €  X  for  each  a  €  (0,  T]. 

Lemma  6.  Assume  (HI),  (H2),  (H3)  and  let  (C/2)  €  Cq  with  xN  and  x  the  corresponding 
solutions  of  (15)  and  (11).  Then  xN (t)  — >•  x(t)  uniformly  in  t  on  [0,T]. 

Lemma  7.  Assume  (HI).  Then  the  solutions  of  (11)  and  (15)  depend  continuously  (in  the 
X  x  L2  topology)  on  (C/2)  G  dJ(A)  x  L2,  uniformly  in  t  on  [0, T). 

Lemma  8.  For  each  q  G  Q,  the  set  Cq  defined  in  Lemma  5  is  dense  in  T>(A)  x  L2  C  X  x  L2. 
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We  obtain  the  convergence  of  Theorem  5  by  combining  Lemmas  6,  7  and  8.  The  proof 
of  Lemma  7  employs  Lemma  3  along  with  Gronwall’s  inequality  in  much  the  same  way  as 
above  in  deriving  (19)  from  (18).  We  note  that  Lemma  5  requires  hypothesis  (H2)  in  order 
to  obtain  enough  smoothness  of  solutions  x  of  (11)  so  that  x(a)  €  X  for  each  a,  which  then 
permits  the  convergence  arguments  of  Lemma  6. 

In  developing  the  estimates  to  establish  Lemma  6  (which,  by  our  above  remarks,  requires 
only  that  we  argue  62{N)  — >  0),  we  use  heavily  the  standard  spline  estimates  found  in 
[Schultz]  and  [SchuVarg],  Lemmas  4  and  5  yield  that  AN  (a)x(cr)  — >•  A(a)x(a)  for  each  a 
so  that  to  prove  Lemma  6  one  only  need  show  that  this  convergence  is  dominated,  thereby 
guaranteeing  £2 (N)  — >•  0.  In  making  the  arguments  for  Lemma  6,  one  obtains  at  the 
same  time  error  estimates  on  the  convergence  in  Theorem  5.  For  example,  one  readily 
finds  the  following:  for  0  €  H 2,  /  satisfying  (HI),  (H2),  0(0)  =  F(0( 0),<f>)  and  02  =  0,  the 
convergence  xN(t)  — >•  x(t)  is  0(1/N).  For  higher-order  splines  and  higher-order  convergence 
estimates  (e.g.,  cubic  splines  with  convergence  0(1/IV3)),  one  of  course  needs  additional 
smoothness  (beyond  (H2))  on  /. 

The  convergence  given  in  Theorem  5  yields  state  approximation  techniques  for  nonlinear 
FDE  systems  based  on  the  spline  methods  developed  in  [BKap] .  These  results  can  be  applied 
directly  to  control  and  identification  problems,  the  latter  of  which  are  discussed  in  [Ba82], 

Remark  9.  Results  for  special  classes  of  the  systems  above  can  actually  be  obtained  from 
the  arguments  for  nonautonomous  nonlinear  delay  systems  in  [Ba79].  In  that  approach, 
one  requires  all  discrete  delays  to  appear  in  the  linear  part  of  the  system  dynamics  while 
continuous  delays  may  appear  in  the  nonlinear  part.  One  then  writes  the  system  dynamics 
as  an  autonomous  linear  part  plus  a  nonlinear  perturbation.  The  linear  part  generates  a 
linear  semigroup  as  in  [BBul,  BBu2,  BKap].  One  then  uses  the  linear  semigroup  in  a  vari¬ 
ation  of  parameters  implicit  representation  of  the  solution  to  the  nonlinear  system.  Math¬ 
ematical  tools  used  then  are  Picard  iterates  for  existence,  and  the  Trotter-Kato  theorem 
[BBul,  BBu2,  BKap]  (for  the  linear  semigroup)  plus  a  Gronwall  inequality.  An  alternative 
(and  more  general)  approach  given  in  [Ba82]  eschews  use  of  the  Trotter-Kato  theorem  in 
treating  general  nonautonomous  nonlinear  delay  systems  which  allows  discrete  delays  in  the 
nonlinear  components  of  the  system.  As  we  have  seen,  the  main  mathematical  tools  are 
dissipative  properties  of  the  general  nonlinear  operator  A(t)  representing  the  system  and 
direct  approximation  AN (t)  — >  A(t)  (no  Trotter-Kato!)  along  with  a  Gronwall  inequality. 
Finally,  a  nonlinear  semigroup  (with  dissipative  generators)  approach  along  with  a  corre¬ 
sponding  nonlinear  Trotter-Kato  convergence  result  are  given  by  Kappel  in  [Kap82].  With 
these  results  one  can  treat  directly  general  autonomous  nonlinear  delay  systems  in  the  spirit 
of  a  linear  semigroup  approach  [Pa]. 

3  Examples 

We  present  three  examples  from  diverse  applications  to  illustrate  use  of  the  above  theo¬ 
retical  and  computational  framework.  The  first  two  applications  have  been  discussed  in 
detail  in  other  presentations  and  hence  we  give  only  brief  summaries.  However,  the  third 
example  is  particularly  novel,  representing  current  efforts  in  an  area  where  dynamical  math¬ 
ematical  modeling  is  virtually  nonexistent  in  the  research  literature.  Thus  we  illustrate  the 
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above  theory  in  a  little  more  detail  for  this  application  and  present  some  specific  new  and 
preliminary  findings. 

3.1  Example  1:  Insect/Insecticide  Models 

We  describe  here  a  non-autonomous  delay  system  arising  in  insect /insecticide  investigations 
[BBDS2,  BBJ],  Mathematical  models  that  are  suitable  for  field  data  with  mixed  popula¬ 
tions  should  consider  reproductive  effects  and  should  also  account  for  multiple  generations, 
containing  neonates  (juveniles)  and  adults  and  their  interconnectedness.  This  suggests  the 
need  at  the  minimum  for  a  coupled  system  of  equations  describing  two  separate  age  classes. 
Additionally,  due  to  individual  differences  within  the  insect  population,  it  is  biologically  un¬ 
realistic  to  assume  that  all  neonate  aphids  born  on  the  same  day  reach  the  adult  age  class 
at  the  same  time.  In  fact,  the  age  at  which  the  insects  reach  adulthood  varies  from  as  few 
as  five  to  as  many  as  seven  days.  Hence  one  must  include  a  term  in  any  model  to  account 
for  this  variability,  leading  one  to  develop  a  coupled  differential  equation  model  including 
distributed  delays  for  the  insect  population  dynamics.  We  consider  the  delay  between  birth 
and  adulthood  for  neonate  pea  aphids  and  present  a  mathematical  model  that  treats  this 
delay  as  a  random  variable. 

Let  a(t)  and  n(t)  denote  the  number  of  adults  and  neonates,  respectively,  in  the  popu¬ 
lation  at  time  t.  We  lump  the  mortality  due  to  insecticide  into  one  time  varying  parameter 
pa(t)  for  the  adults,  pn(t)  for  the  neonates,  and  denote  by  da(t)  and  dn(t)  the  background 
or  natural  mortalities  for  adults  and  neonates,  respectively.  We  let  6(f)  be  the  time  varying 
rate  at  which  neonates  are  born  into  the  population. 

We  suppose  that  there  is  a  time  delay  for  maturation  of  a  neonate  to  adult  life  stage. 
We  further  assume  that  this  time  delay  varies  across  the  insect  population  according  to 
a  probability  distribution  P(t)  for  r  £  [— Tn,0]  with  corresponding  density  k{r)  =  dl ^  ■ 
Here  we  tacitly  assume  an  upper  bound  on  Tn  for  the  maturation  period  of  neonates  into 
adults.  Thus,  we  have  that  k{r),  r  <  0,  is  the  probability  per  unit  time  that  a  neonate  who 

has  been  in  the  population  — r  time  units  becomes  an  adult.  Then  the  rate  at  which  such 

neonates  become  adults  is  n(t  +  r)fe(r).  Summing  over  all  such  r’s,  we  obtain  that  the  rate 
at  which  neonates  become  adults  is  f°T  n(t  +  r)k{r)dT.  Using  the  biological  knowledge 
that  the  maturation  process  varies  between  five  and  seven  days  (i.e. ,  k  vanishes  outside 
[-7,-5]),  we  obtain  the  functional  differential  equation  (FDE)  system 

(|(f)  =  n(t  +  r)fc(r)dr  -  (da(t)  +  pa(t))  a(t) 

%jr(t)  =  b(t)a(t )  -  (dn(t)  +  Pn(t ))  n(t)  -  fz75  n(t  +  r)fc(r)dr  /2q\ 

a{0)  =  $(0),  n(0)  =  &(0),  0G  [-7,0) 

a(  0)  =  a0,  n(0)  =  n°, 

where  k  is  now  a  probability  density  kernel  which  we  have  assumed  has  the  property  k(r)  >  0 
for  r  £  [—7,  —5]  and  k{r)  =  0  for  r  €  (— oo,  —7)  U  (—5,  0]. 

In  this  problem  the  parameters  to  be  estimated  are  time  dependent  coefficients 
da(t) , pa(t) ,  b(t) ,  dn(t),pn(t)  as  well  as  the  probability  density  maturation  kernels  k(r). 

In  summary,  the  authors  of  [BBJ]  present  a  time  delay  differential  equation  model  with 
time  dependent  parameters  as  well  as  probability  density  maturation  kernels  that  might  be 
used  to  investigate  mixed  neonate/adult  multi-generational  populations.  The  formulation 
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includes  these  models  as  special  cases  of  a  class  of  abstract  differential  equations  with 
function  space  parameters  (including  probability  densities)  which  are  readily  approximated 
by  finite  element  systems  (the  spaces  QAI  are  piecewise  linear  splines  for  time  varying 
coefficients  as  well  as  density  kernels).  The  inverse  problems  are  formulated  in  the  context 
of  both  ordinary  and  generalized  least  squares  frameworks,  and  computations  are  carried  out 
(including  an  uncertainty  analysis  with  confidence  intervals  via  asymptotic  error  analysis 
involving  approximate  sampling  distributions)  with  simulated  noisy  data  to  demonstrate 
both  efficiency  and  efficacy  of  the  methodologies. 

3.2  Example  2:  HIV  Infection  Dynamics 

We  next  consider  classes  of  nonlinear  functional  or  delay  differential  equation  models  which 
arise  in  attempts  to  describe  temporal  delays  in  HIV  pathogenesis.  These  models,  first 
developed  in  [BBH]  consider  incorporation  of  variability  (i.e.,  general  probability  distribu¬ 
tions)  for  these  delays  into  systems  that  cannot  readily  be  reduced  to  a  finite  number  of 
coupled  ordinary  differential  equations  (as  is  done  in  the  method  of  stages).  In  [BBH],  the 
authors  introduced  several  classes  of  nonlinear  models  (including  discrete  and  distributed 
delays),  and  presented  discussions  of  theoretical  and  computational  approaches.  The  models 
were  validated  with  in  vitro  experimental  data  [RWE]  in  successful  inverse  problem  efforts. 
This  was  supported  by  statistical  significance  tests  for  the  importance  of  including  delays 
in  the  dynamics. 

The  underlying  biology  is  discussed  in  some  detail  in  [BBH]  to  which  we  refer  interested 
readers.  Viruses  are  obligate  intra-cellular  parasites  with  a  multitude  of  pathways  for  infect¬ 
ing  and  reproducing  within  their  target  hosts.  The  Human  Immunodeficiency  Virus  (HIV) 
is  a  lentivirus  that  is  the  etiological  agent  for  the  slow,  progressive,  and  fatal  Acquired 
Immunodeficiency  Syndrome  (AIDS)  for  which  there  is  currently  no  known  cure. 

For  HIV,  the  core  of  the  virus  is  composed  of  single-stranded  viral  RNA  and  protein 
components.  As  depicted  in  Figure  1,  when  an  HIV  virion  comes  into  contact  with  an 
uninfected  CD4  target  cell,  the  viral  envelope  glycoproteins  fuse  to  the  cell’s  lipid  bilayer 
at  a  CD4  receptor  site  and  the  viral  core  is  injected  into  the  cell.  Once  inside,  the  protein 
components  enable  transcription  and  integration  of  the  viral  RNA  into  viral  DNA  and  then 
incorporation  into  the  cellular  DNA  (provirus).  With  its  altered  cellular  DNA,  the  cell 
produces  capsids  and  protein  envelopes  and  transcribes  multiple  copies  of  viral  RNA.  The 
cell  assembles  a  virion  by  then  encasing  the  newly  produced  viral  RNA  in  a  capsid  followed 
by  a  protein  envelope.  The  new  HIV  virion  pushes  out  through  the  cell  membrane  budding 
off  in  chains  of  virions  (though  sometimes  single  virions  do  float  away  into  the  plasma). 
The  time  from  viral  infection  to  viral  production  (sometimes  called  the  eclipse  phase)  is 
not  instantaneous,  and  (as  indicated  in  the  figure)  it  is  estimated  that  the  first  viral  release 
occurs  approximately  24  hours  after  the  initial  infection. 

Within  the  HIV  modeling  community,  there  has  been  considerable  debate  on  the  proper 
compartment  definitions  in  models.  The  multi-compartment  model  introduced  and  em¬ 
ployed  in  [BBH]  describes  pathways  from  the  moment  a  virion  contacts  the  appropriate 
receptor  site  as  the  beginning  of  acute  infection.  If  the  acutely  infected  cell  survives  through 
its  first  viral  release,  roughly  3  hours  later  the  physiological  characteristics  of  the  cell  change 
and  it  is  subsequently  classified  as  a  chronically  infected  cell.  Note  that  in  the  chronic  stage, 
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Figure  1:  HIV  Infection  Pathway 
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it  is  possible  for  the  cells  to  continue  to  divide  (albeit  at  a  much  slower  rate  than  acutely 
infected  or  non-infected  cells)  and  to  produce  virions. 

In  the  course  of  developing  the  models,  one  employs  a  delay  to  mathematically  represent 
the  temporal  lag  between  the  initial  viral  infection  and  the  first  release  of  new  virions.  We 
concentrate  on  the  mathematical  modeling  of  viral  dynamics,  focusing  in  particular  on  the 
mathematical  aspects  and  biological  nature  of  the  delays  in  primary  infection.  The  models 
are  extensions  of  previous  modeling  work  on  HIV  infection  dynamics  for  in  vitro  laboratory 
experiments  from  the  (continuous)  delay  differential  equations  developed  in  [BGHKNS], 
which  in  turn  were  based  on  a  discrete  dynamical  system  from  [HE],  Our  primary  interest 
here  is  to  present  the  functional  differential  equations  required  when  treating  cellular  level 
data  containing  significant  variability  as  a  specific  example  to  which  the  theory  developed 
in  this  paper  is  applicable.  In  this  example,  a  major  part  of  the  efforts  involved  estimation 
of  parameters  that  are  probability  densities. 

A  central  focus  of  the  modeling  efforts  have  been  on  attempting  to  obtain  reasonable 
mathematical  representations  of  these  delays.  The  problem  of  how  to  mathematically  repre¬ 
sent  these  phenomena  is  decidedly  nontrivial  and  includes  issues  such  as  how  to  account  for 
intra-individual  variability  (e.g.,  intercellular  variability  arising  within  a  single  infected  in¬ 
dividual  or  laboratory  assay)  and/or  inter-individual  variability  arising  between  individual 
subjects  or  data  from  multiple  assays.  These  issues  are  highly  significant  and  dealing  with 
the  levels  of  variability  and  the  resulting  mathematical  ramifications  is  of  primary  interest. 

The  basic  model  involving  delays  has  the  form 


V(t)  =  -cV(t) +nAMt  ~  Ti) +ncC(t)  -  pV(t)T(t) 

A(t)  =  (rv  -  8a  ~  SX(t))A(t)  -  -/A(t  -  n  -  r2)  +  pV(t)T(t) 
C(t)  =  (rv  -  5c  -  5X(tj)C(t)  +  jA(t  —  n  -  r2) 

T(t)  =  (ru  -  8U  -  5X(t)  -  pV(t))T(t )  +  5  , 


where  the  state  variables  are  given  by  V  =infectious  viral  population,  A=acutely  infected 
cells,  C=chronically  infected  cells,  T=uninfected  or  target  cells,  X  =  A  +  C  +  T=total 
cell  population  (infected  and  uninfected),  and  the  parameters  are  given  in  Table  1.  In  this 
model,  it  is  assumed  that  the  delays  ti,  72  are  fixed  for  each  cell,  and  that  one  can  precisely 
describe  the  capacity  of  each  member  of  the  population  (of  infected  cells)  to  produce  virions 
as  a  function  of  time.  More  precisely,  exactly  t\  units  of  time  after  a  cell  becomes  infected, 
it  begins  producing  virus.  Exactly  T2  units  of  time  later,  that  same  cell  then  becomes 
chronically  infected  (assuming  it  lives  to  this  stage). 

From  a  biological  viewpoint,  it  is  unlikely  that  all  cells  have  precisely  the  same  delay 
times  in  their  production  characteristics  or  conversion  to  a  chronically  infected  stage.  To 
accommodate  some  variability  expected  in  such  biological  populations,  let  the  delay  in  the 
first  equation  in  (21)  be  modeled  by  treating  the  delay  time  r  between  acute  infection  and 
viral  production  as  a  probabilistic  quantity  (i.e.,  a  random  variable)  with  distribution  T’i(r) 
so  that  the  first  equation  in  (21)  is  replaced  by  (see  the  appendix  of  [BBH]  for  a  more 
detailed  discussion  of  the  foundations  underlying  such  an  equation) 

V(t)  =  -cV(t)  +  ha  f  A(t  +  T)dPi(r)  +  ncC(t)-p(V(t),T(t)).  (22) 

J  —  OO 
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Notation 

Description 

c 

Infectious  viral  clearance  rate 

nA 

Infectious  viral  production  rate  for  acutely 
infected  cells 

nc 

Infectious  viral  production  rate  for  chronically 
infected  cells 

7 

Rate  at  which  acutely  infected  cells  become 
chronically  infected 

rv 

Birth-rate  for  virally  infected  cells 

ru 

Birth-rate  for  uninfected  cells 

6a 

Death-rate  for  acutely  infected  cells 

6c 

Death-rate  for  chronically  infected  cells 

6U 

Death-rate  for  uninfected  cells 

6 

Density  dependent  overall  cell  death-rate 

p 

Rate  of  infection 

S 

Constant  rate  of  target  cell  replacement 

Table  1:  in  vitro  model  parameters 


The  function  p(V,T ),  where  x  >-)•  p(x),  x  =  (V,T),  is  globally  Lipschitz  as  hypothesized  in 
(HI)  in  Section  2  above.  For  the  efforts  here  and  in  [BBH],  the  function  p  is  assumed  to  be 
locally  bilinear,  i.e.,  p(V,T)  =  pVT  before  saturation  and  constant  or  linear  thereafter  (see 
[BBH]). 

Likewise,  let  the  delay  between  acute  infectivity  and  chronic  infectivity  (with  distri¬ 
bution  P2(t))  be  represented  in  altered  forms  of  the  second  and  third  equations  of  (21) 
by 


A (t)  =  (rv- 5a- 5X(t))A(t)  —7  f  A(t.  +  t)cIP2(t)  +  pV{t)T(t)  (23) 

J  —  OO 

C(t)  =  (rv-6c-6X(t))C(t)+1  f  A(t  +  T)dP2{r).  (24) 

J  —  OO 

The  resulting  model  becomes  the  special  case 

V(t)  =  ~cV(t)  +  nAj°_OQA(t  +  T)ki{T)dT  +  ncC{t) -pV(t)T(t) 

A(t)  =  (rv-SA-SX(t))A(t)-'yf°00A(t  +  T)k2(T)dT+pV(t)T(t)  ^ 

C(t)  =  (rv  -  Sc  -  6X(t))C(t)  +7  f°ooA(t  +  T)k2{T)dT 

T(t )  =  (ru  -  Su  -  8X(t)  -  PV(t))T(t)  +  S  , 

whenever  Pi,  P2  possess  probability  densities  k\ ,  k2  respectively.  In  the  discussions  in 
[BBH] ,  all  numerical  simulations  for  each  of  the  systems  of  functional  differential  equations 
given  above  were  performed  using  the  methods  described  in  Section  2  with  piecewise  linear 
splines  for  the  states.  There  it  was  found,  not  surprisingly,  the  presence  of  nonzero  delays 
has  a  dramatic  effect  upon  the  simulations.  Issues  related  to  the  exact  nature  of  r  and 
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whether  or  not  it  should  be  modeled  as  a  fixed  value  for  every  cell  or  distributed  across  the 
cell  populations  and  how  this  distribution  can  be  represented,  as  well  as  further  evidence 
of  the  statistical  significance  of  the  presence  of  the  delays  are  the  focus  of  discussions  in 
[BBH], 

The  variables  V  and  C  in  the  above  model  are  actually  expected  values.  To  explain  this, 
we  first  consider  the  delay  between  initial  acute  infection  and  initial  chronic  infection  of  a 
cell.  It  is  biologically  unrealistic  to  expect  an  entire  population  of  cells  to  simultaneously 
change  infection  characteristics  fi 2  =  ri  +  t2  {p, 2  >  0)  hours  after  initial  viral  infection. 
Therefore,  suppose  that  the  delay  between  initial  acute  infection  and  chronic  infection  varies 
across  the  cell  population  (thus  mathematically  characterizing  the  intercellular  variability) 
according  to  a  probabilistic  distribution  P2  with  density  k2-  We  denote  by  C(t;r)  the 
subpopulation  consisting  of  chronically  infected  cells  that  either  maintained  their  acute 
infection  characteristics  for  r  time  units  or  are  the  progeny  of  those  same  cells.  In  other 
words,  for  some  r  >  0,  there  exists  a  subpopulation  C(t ;  r)  of  the  chronically  infected  cells 
which  either  spent  r  hours  as  acutely  infected  cells  (before  converting  to  chronically  infected 
cells)  or  are  descendants  of  cells  that  spent  exactly  r  hours  as  acutely  infected  cells.  Thus, 
as  derived  carefully  in  [BBH]  one  has 

C(t)  =  £2[C{t]T)\  =  C(t;T)fa(r)dT;  (26) 


where 


C  (t;r)  =  (rv  -  Sc  -  5X  (t))C  (t;r)  +  ^A(t  -  r)  , 

with 

X(t)  =  A(t)  +  C(t)  +  T(t). 

Integration  of  this  equation  over  the  distribution  P2,  over  all  possible  delays,  yields  the 
equation  for  C,  the  expected  value  of  the  population  of  chronic  cells,  given  by 

C(t)  =  £2[C(t;r)\ 

=  (rv  -  Sc  -  SX  (t))C  (t)  +  7 /0°°  A(t  -  t)  k2(r)  dr  (27) 

C-(O)  =  Co, 

where  Co  is  the  initial  condition  for  the  total  chronically  infected  cell  population.  A  simple 
change  of  variables  in  the  integral  term  as  described  below  results  in  the  third  equation  of 

(25)' 

A  similar  argument  can  be  made  for  the  delay  between  viral  infection  and  viral  pro¬ 
duction  for  the  acutely  infected  cells  A{t).  One  supposes  that  the  delay  between  infection 
and  production  (for  acutely  infected  cells  A(t))  varies  across  the  population  with  proba¬ 
bility  distribution  Pi  and  corresponding  density  k\  and  partitions  the  expected  total  viral 
population  V  into  those  virions  Va  produced  by  acutely  infected  cells  and  those  virions  Vc 
produced  by  chronically  infected  cells.  We  then  denote  by  Vh(t;  r)  the  subpopulation  of 
virus  which  are  produced  by  an  acutely  infected  cell  r  hours  after  being  infected.  Thus,  the 
rate  of  change  in  this  subgroup  of  virions  is  governed  by 

VA(t;r )  =  -cVA{t\T)  +nAA(t  -  t)  -  pVA{t-,T)T(t) . 


20 


To  obtain  the  (expected)  number  of  virus  at  time  t  that  have  been  produced  by  acutely 
infected  cells,  we  must  integrate  over  the  distribution  Pi,  over  all  possible  delays 

POO 

VA{t)  =  £i[Va(P,t)\  =  /  VA(t-,T)ki(T)d,T , 

Jo 

which  yields  the  governing  equation  for  this  larger  subpopulation  of  virions 

VA(t)  =  £i[VA(t-,r)} 

POO 

=  - cVA(t)  +  nA  A(t  —  T)k\(j)dT  —  pVA(t)T(t) . 

Jo 

To  account  for  the  chronically  infected  cells  as  a  source  of  virions,  we  denote  Vc  as  the  sub¬ 
population  of  virions  produced  by  chronically  infected  cells.  Thus  the  equation  describing 
the  rate  of  change  in  the  size  of  this  subpopulation  is 

Vc{t)  =  -cVc(t)  +  ncC(t)-pVc(t)T(t), 


where  the  expected  value  C  of  the  total  population  of  chronically  infected  cells  is  defined  in 
(26).  Therefore,  the  governing  equations  for  the  total  population  of  virus  are  described  by 


m 


V(0) 


Si[VA  (t-r)  +  Vc{t)\ 

POO 

-c{VA(t) +  Vc(t))  +  nA  /  A(t  -  T)ki(r)dr  +  ncC(t) 

Jo 

-p(VA(t)  +  Vc(t))T(t) 


—cV  ( t )  +  nA 


A(t  —  T)k\(r)dT  +  ncC(t )  —  pV(t)T(t) 


Vo, 


where  Vo  is  the  initial  condition  for  the  total  virions  population. 

If  one  assumes  that  the  A  and  T  subclasses  have  no  subpopulation  structures,  and  are 
therefore  governed  by 

POO 

A(t)  =  (rv  -  dA  —  SX  (t))  A  (t)  —  7  /  A  (t  —  r)  k2  (r)  dr 

Jo 

+pV(t)T(t ) 

4(0)  =  40 

T{t)  =  (ru  —  5U  —  SX  (t)  —  pV  (t))  T  (t)  +  S 

T  (0)  =  T0, 

with  initial  conditions  Aq  and  To,  we  are  subsequently  led  to  the  model  equations  (25)  by 
making  the  change  of  variables  ki(^)  =  ki(—£)  so  that  the  densities  are  now  defined  on 
(—oo,0)  instead  of  (0,oo)  . 

In  summary,  the  theoretical  and  approximation  methodology  of  Section  2  provide  a 
sound  foundation  for  inverse  problem  investigations  such  as  those  of  [BBH],  Among  the 
important  findings  for  the  models  developed  are:  (i)  an  excellent  fit  to  in  vitro  experimental 
data;  (ii)  a  rigorous  model  comparison  statistical  analysis  to  support  importance  of  delays 


21 


(statistical  significance  in  improving  fits-to-data)  in  the  models;  (iii)  analysis  to  show  that 
the  models  with  discrete  delays  yield  essentially  similar  dynamic  responses  to  those  from 
models  with  continuous  delays.  This  last  finding  is  important  biologically  since  it  is  highly 
unlikely  that  all  cells  in  a  population  can  respond  with  fixed  uniform  delays. 

3.3  Example  3:  The  drinking  behavior  control  system  (DBCS) 

Researchers  studying  alcohol  abuse  and  addiction  have  collected  vast  amounts  of  information 
on  substance  use,  participant’s  willingness  to  change  behavior,  and  participant’s  success 
in  a  particular  treatment.  Many  hypotheses  have  been  formulated  concerning  possible 
(difficult-to-measure)  factors  that  control  a  patient’s  motivations  and  behavior,  such  as 
the  relative  importance  of  drinking  in  their  lives,  commitment  to  reducing  their  alcohol 
consumption,  and  recognition  of  reasons  to  not  use  a  substance,  to  name  a  few.  However, 
the  relative  contributions  of  these  possible  mechanisms  for  behavior  change  are  unclear. 
The  interplay  among  these  factors  as  they  change  over  time  is  a  natural,  albeit  difficult, 
question  to  address  via  dynamical  mathematical  models.  In  order  to  better  understand  these 
ideas  in  a  quantitative  context  and  to  identify  underlying  mechanisms  governing  drinking 
behavior  in  problem  drinkers  during  therapy,  we  have,  in  joint  efforts  [BRSDHKM]  with  a 
team  of  psychologists  at  Columbia  University,  attempted  to  model  behavior  control  systems 
informed  by  a  dataset,  Project  MOTION.  We  present  here  an  initial  model  developed  in 
these  collaborations. 

Further  details  on  Project  MOTION  and  the  data  collected  can  be  found  in  [BRSDHKM], 
Briefly,  approximately  90  participants  were  assigned  to  one  of  three  therapy-based  treat¬ 
ment  groups.  In  addition  to  attending  the  four  therapy  sessions  over  an  eight  week  period, 
patients  were  directed  to  call  an  interactive  voice  recording  (IVR)  system  and  answer  a 
survey  of  41  questions  every  day  during  the  evening  for  eight  weeks.  Our  modeling  efforts 
focused  on  the  data  from  the  IVR  surveys  since  there  are  numerous  longitudinal  time  points, 
which  we  anticipated  would  be  more  informative  of  the  underlying  dynamic  processes. 

The  41  questions  of  the  IVR  are  divided  into  topical  groups  in  the  survey  form.  Each 
group  has  its  own  scale  by  which  a  participants’  numerical  responses  are  interpreted.  Since 
it  is  prohibitive  to  construct  an  initial  model  from  so  many  variables,  we  averaged  responses 
from  similar  categories  which  led  to  conceptual  variables.  Among  the  conceptual  variables  we 
considered  based  on  the  IVR  data  were  stressful  events,  pleasant  events,  pressure  to  drink, 
current  mood,  perceived  stress,  desire  to  drink,  commitment  to  not  drink  for  the  next  24 
hours,  confidence  and  commitment  to  reduce  drinking  for  the  next  24  hours,  guilt  concerning 
drinking  behavior,  and  alcohol  consumption.  The  models  were  then  developed  based  on 
these  variables.  During  the  formulation  of  these  models  based  on  the  longitudinal  data, 
we  determined  that  delays  and  cumulative  effects  are  important  and  should  be  included  in 
order  to  accurately  reflect  the  dynamic  changes  in  a  person’s  behavior. 

In  one  of  our  preliminary  models,  we  focused  on  three  state  variables:  desire  to  drink  or 
Desire,  denoted  as  D{€)\  the  extent  to  which  the  subject  feels  their  drinking  over  the  recent 
past  was  excessive,  referred  to  in  short  as  Guilt,  G(t);  and  alcoholic  beverage  consumption 
rate  or  Alcohol,  A(t).  The  alcohol  consumption  function  A(t)  describes  the  rate  at  which 
the  participant  is  consuming  alcoholic  beverages  and  has  units  of  drinks/time.  In  contrast 
the  other  two  state  variables  are  unit-less,  measuring  the  extent  to  which  a  subject  agrees 
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with  a  statement  on  the  intensity  of  their  feelings  on  a  particular  subject.  One  preliminary 
simplified  model  had  the  form 

^A(i)  =  -012  G(t  +  s)K1(s)ds^j  +  ai3X{D>o}D(t) 

Jt° (t)  =  «2i  (A(t  —  T\)  —  Aq)  (28) 

6‘TP(g^/  G'(t  +  s)K2(s)ds^  -G*D2  , 

where  ki(s)  =  K2 (s)  =  ,  and  r  =  2,  indicating  that  behavior  over  the  past 

two  days  has  an  impact  on  current  behavior.  We  included  an  indicator  function  X(D> o) 
to  reflect  that  only  when  a  person  desires  alcohol  does  his/her  drinking  behavior  change. 
Additionally,  we  enforced  the  conditions  A(t)  >  0  and  G(t )  >  0,  where  for  all  variables  a 
value  of  0  indicates  a  neutral  value  (the  scaled  variables  had  values  in  the  range  [—2,2]). 
For  example  D(t)  =  0  indicates  neither  a  desire  nor  a  dislike  for  alcohol,  and  G(t)  =  0 
indicates  that  a  person  feels  no  particular  feelings  of  guilt  or  virtue. 

Interestingly,  it  appeared  that  one  patient’s  drinking  pattern  could  be  reasonably  de¬ 
scribed  when  considering  just  two  variables:  the  alcohol  consumption  rate  A(t),  and  the 
guilt  G(t).  A  key  observation  was  that  the  individual’s  drinking  was  driven  by  an  innate  re¬ 
ward/desire  mechanism.  This  mechanism  is  the  ingrained  desire  for  drinking  that  separates 
problem  drinkers  from  those  who  drink  casually  (or  less).  It  is  known  that  animals  and 
human  beings  in  particular  yearn  for  something  representing  a  reward,  with  the  specific 
reward  varying  among  individuals.  Examples  include  food  for  some  people,  smoking  for 
others,  etc.  In  addition,  it  is  known  that  individuals  learn  to  turn  this  desire  off  when  the 
reward  is  unavailable  or  they  have  decided  they  cannot  indulge.  So  in  addition  to  being 
a  mechanism  for  desire  it  can  also  have  the  effect  of  controlling  or  limiting  one’s  intake  of 
alcohol. 

The  model  resulting  from  analysis  of  the  data  for  this  patient  in  such  a  context  is  given 

by 

— A(t)  =  -auX{G>G*}(G(t)  -  G*)  +  ai3h(t) 

d  r  r° 

—G(t)  =  a2i  J  A{t  +  s)ds  —  (1  +  cx{w(t)})A*  (29) 

where  the  function  h(t)  represents  the  subject’s  desire/reward  mechanism,  which  increases 
going  into  the  weekend  (to  turn  it  ‘on’),  and  decreases  coming  out  of  the  weekend  (thereby 
turning  it  ‘off’).  This  particular  individual  allowed  himself  to  drink  on  the  weekend  as  long 
as  he  refrains  during  the  week,  so  h(t)  has  the  form 

1.5  <  t  <  2  (Friday  a.m.  through  Friday  p.m.) 

2  <  t  <  2.5  (Friday  p.m.  through  Saturday  a.m.) 

3.5  <  t  <  4  (Sunday  a.m.  through  Sunday  p.m.) 

4  <  t  <  4.5  (Sunday  p.m.  through  Monday  a.m.) 

otherwise, 


'  2(t-  1.5) 
— 2(t  —  2.5) 
h(t)  =  — 2(t  —  3.5) 

2{t  —  4.5) 
0 


jtm  =  “°32  1 
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where  t  =  t  mod  7. 

The  effect  of  ‘guilt’  decreases  the  individual’s  drinking  rate  only  once  it  surpasses  a 
certain  ‘threshold’  level  G*.  In  contrast,  the  effect  of  alcohol,  or  specifically  the  number 
of  drinks  consumed  in  the  recent  past  f°r  A(t  +  s)ds  can  decrease  one’s  guilt  if  it  is  below 
a  certain  acceptable  level,  A*  during  the  week  and  (1  +  c)A*  during  the  weekend  (this  is 
implemented  through  the  week/weekend  characteristic  function  X{w(t)})-  If  it  surpasses 
these  levels  which  the  individual  has  rationalized  as  acceptable,  he  feels  that  his  drinking 
was  excessive,  with  the  extent  of  that  feeling  proportional  to  how  much  his  recent  drinking 
has  surpassed  those  levels. 

3.3.1  Simulated  inverse  problem 

Not  surprisingly,  most  of  the  parameters  are  unknown  in  the  above  delay  model,  and  it  is 
of  great  interest  to  be  able  to  estimate  them  using  longitudinal  data.  To  investigate  our 
ability  to  do  so,  we  generated  some  data  with  various  levels  of  added  noise,  representative 
of  that  in  the  IVR,  on  the  drinks  consumed  over  the  past  day  rlj  ~  /'  (tj)  =  f°1  A(tj  +  s)ds 
and  the  extent  to  which  they  felt  that  was  excessive  dj  «  f2{tj )  =  G(tj).  The  generated 
data  has  been  computed  as 

d }  =  fitj)  +  Y^meanjif1  (tj)}Af(0, 1), 

d2j  =  )  +  Y^meani{/2(ii)}-^(°>  !). 

where  j  =  1, ,  K ,  and  k  =  1,5,  or  10,  corresponding  to  1%,  5%  or  10%  error.  Here  A7(0, 1) 
is  a  standard  Gaussian  with  mean  zero  and  unity  variance.  All  computations,  including  the 
generation  of  data  for  use  in  the  inverse  problem,  were  done  using  piecewise  linear  splines 
with  N  =  32  to  solve  the  delay  system.  The  relevant  timescale  for  the  IVR  data  appears  to 
be  ‘triweekly’,  as  daily  data  exhibits  too  much  variation  so  that  the  trends  were  not  obvious, 
and  too  much  information  was  lost  if  the  daily  responses  were  averaged  over  a  week.  We 
note  that  averaging  is  preferred  to  summing  with  these  data,  to  lessen  the  impact  of  missing 
responses.  Drinking  behavior  with  most  individuals  is  starkly  different  on  the  weekends  as 
opposed  to  weekdays,  so  one  of  the  time  intervals  of  the  triweekly  time  scale  begins  on 
Fri  evening  and  lasts  to  the  time  of  the  call  on  Sun  evening.  The  other  two  time  intervals 
are  Sun  evening  to  Wed  evening,  and  then  Wed  evening  to  Fri  evening.  A  comparison  of 
noise- less  (0%  error)  data  and  that  with  10%  error  is  shown  in  Figure  2.  Parameter  values 
and  initial  conditions  used  to  generate  these  data  are  given  in  Table  2. 


<3 12 

0.15 

A* 

1.5 

a  1 3 

16 

r 

1 

<321 

0.8 

M  o) 

1 

c 

4.25 

G(0) 

-0.5 

Table  2:  Parameter  values  and  initial  conditions  used  to  generate  the  data.  Initially,  A(cf> )  = 
1,  G((f>)  =  —0.5  for  (f>  €  [— r,  0). 
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Figure  2:  Generated  data  with  0%  and  10%  error.  Contained  in  the  left  panel  is  the  number 
of  drinks  over  a  day  dj  ~  A(tj  +  s)ds  and  in  the  right  is  the  guilt  dj  ~  G(tj).  These  are 
compared  to  the  model  solution  (solid  lines)  with  no  noise. 


Specifically,  we  discuss  here  our  ability  to  estimate  parameters  6  =  (013,021  )T  and 
6  =  (ai3,r)T  where  r  is  the  length  of  time  that  the  drinking  may  influence  the  individual’s 
‘guilt’.  The  period  of  time  over  which  the  individual  may  be  reflecting  to  influence  that 
response  may  not  always  be  known. 

Estimates  9  to  parameters  are  obtained  by  minimizing  the  ordinary  least  squares  func¬ 
tional 

K 

J(6)  =  J2 \d)  -  f%;  0)|2  +  |  d)  -  f\tj- 6)\2 
3= 1 

over  the  feasible  parameter  set  0  C  M2. 

Values  of  estimates  for  9  =  (ai3,02i)r  with  known  cumulative  effect  r  =  1,  and  for 
9  =  (ai3,  r)T  are  contained  in  Table  3  for  1,  5,  and  10%  noise,  respectively.  Since  parameter 
estimates  are  relatively  close  to  their  ‘true’  values,  or  the  values  that  were  used  to  generate 
the  data,  it  appears  that  we  could  reasonably  expect  to  estimate  these  parameters  from 
similar  data.  The  fit  to  10%  noisy  data  is  shown  in  Figure  3  when  estimating  the  length 
of  time  interval  r  and  parameter  013.  Fits  when  estimating  either  9  =  (ai3,d2i)r  or  9  = 
(a i3,r)r  are  comparable,  as  suggested  by  the  RSS  values. 

The  many  influences  behind  drinking  behavior  are  difficult  to  identify  and  relate  in  a 
quantitative  manner.  However,  our  initial  modeling  efforts  have  shown  behavior  similar 
to  that  seen  in  data  from  Project  MOTION.  Performing  the  inverse  problem  on  simulated 
data  from  the  models  such  as  (28)  and  (29)  is  not  a  trivial  task,  but  the  above  results 
and  other  computational  results  suggest  that  this  will  be  feasible  with  appropriate  data. 
Current  investigations  into  the  sensitivity  of  the  model  with  respect  to  the  parameters  and 
discrete  delays  will  provide  information  regarding  which  parameters  are  most  influential  on 
model  behavior. 
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013 

021 

RSS 

Ol3 

r 

RSS 

True  value 

16 

0.8 

16 

i 

Initial  guess 

14 

1 

14 

1.2 

0(1) 

15.999 

0.8001 

0.0397 

15.9994 

1.0001 

0.03967 

9(5) 

16.011 

0.79928 

0.688 

15.9942 

1.00196 

0.6746 

0(10) 

15.9880 

0.80119 

3.588 

16.0043 

1.0001 

3.5865 

Table  3:  Parameter  estimates  9  =  (ai3,a2i)r  on  the  left,  and  9  =  (013,  f)T  on  the  right 
and  the  residual  sum  of  squares  RSS  from  fitting  generated  data  with  1,  5,  and  10%  error, 
respectively.  The  superscript  in  the  symbol  9  indicates  the  level  of  error. 


Figure  3:  Best  fit  solutions  when  estimating  fh10)  =  (013,  r)T  compared  to  generated  data 
with  10%  error. 

4  Concluding  Remarks 

In  this  paper  we  have  presented  new  theoretical  results  for  inverse  problems  involving  general 
nonlinear  nonautonomous  delay  systems  with  functional  (time-dependent  rate  functions, 
probability  distributions)  parameters.  An  approximation  framework  that  entails  approx¬ 
imation  of  both  the  infinite  dimensional  dynamical  state  spaces  and  infinite  dimensional 
parameter  sets  is  given.  This  provides  a  rigorous  foundation  for  a  wide  class  of  problems 
arising  in  applications.  We  illustrate  the  ideas  by  brief  discussions  of  examples  from  insect 
populations  with  time  dependent  maturation  and  death  rate,  cellular  level  HIV  models  with 
uncertainty  in  process  delays,  and  models  for  changing  behavior  in  response  during  alcohol 
therapy.  We  also  demonstrate  the  efficacy  of  the  approximation  methods  with  computations 
for  the  behavioral  models. 
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